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Computer Science and Engineering 


Abstract 


We study the problem of translating updates of database views. We disambiguate a view 
update by requiring that a specified view complement (i.e. a second view which contains all the 
database information omitted from the given view) remains constant during the translation. We 
study some of the computational problems related to the application of this general methodology 


in the context of relational databases. 


We consider, for the most part, databases consisting of a single relation, with functional 
dependencies as the only integrity constraints; we also restrict our attention to views defined by 
projections. We first give a characterization of complementary views (valid also in the presence 
of join dependencies), which leads to efficient algorithms for checking if two given views are 
complementary and for determining a non-redundant complement of a given view. We also 


show that the problem of finding a minimum complement of a given view is NP-complete. 


We then study in detail the problem of translating the insertion of a tuple into a view. We 
show how to do the translation in case the insertion is translatable, and we also develop a 
polynomial time algorithm for testing translatability; we also give two stronger, more efficient 
translatability tests. We show lower bounds for the complexity of the translatability problem, by 


proving that it becomes IT f-hard if the view is given in an exponentially succinct way; an 


analogous result is shown for one of the stronger tests. We also examine the problem of 
determining a complement which renders a given insertion translatable; we find that it can be 
solved in time polynomial in the view, but becomes NP-hard if the view is given in an 


exponentially succinct way; again, analogous results are valid for the stronger tests. 


The above results are extended, in a straightforward way, to the cases of deletion and 
replacement of a tuple. Finally, we define and study a new kind of functional dependencies 
which is important in the context of complements, the explicit functional dependencies (EFD’s), 
which intuitively state that some part of the database information can be computed from the 
rest. We examine the interaction of EFD’s with functional dependencies and join dependencies, 
and we also extend our characterization of complementary views to allow for the presence of 


EFD’s. 
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1. Introduction 


LI. The Problem and the Approach 


In database systems, the amount and structure of the stored data is decided by the database 
administrator. However, individual users often want to deal with only part of the information in 
the database, and moreover they may want to restructure it in a way suitable to their needs. For 
this reason, database systems often provide the view facility. A view is defined by giving a query 
on the whole database. At any point, the contents of the view is just the outcome of this query. 
The user queries and updates the view as though it were a database in itself, with no reference 
to the underlying database. The view idea spares the user from the conceptual complexities of 
the whole database, makes queries easier by "factoring out" a common subexpression, and can 
serve as a protection mechanism by restricting access to only insensitive information. A view 
facility is an important part of many relational database systems, e.g. PRTV [T], QBE [ZI], 
System R [As] and INGRES [SWKH] (as well as of database systems designed along the lines of 
the network data model, like DBTG [CO], or the hierarchical data model, like IMS [D, I]. 

In relational database systems, a view is in general implemented by naming and storing its 
_ definition, which is just a query definition in the query language of the system. Queries on the 
view are translated into database queries by composing them with the view definition. Thus, 
querying a view presents no serious conceptual problems. 

What is much more complex is the subject of updating a view. A simple update operation, 
such as inserting a tuple in the view, may create formidable problems. The underlying database 
update may be ambiguous, ill-defined, create inconsistencies in the database, or have side-effects 
on the view. This problem seems to be related to such fundamental issues as null values [Co4, 
Za2] and update anomalies [Col, Co3, BBG] in relational databases. Most existing systems do not 
allow updates of views (e.g. PRTV, QBE), or allow them only in the trivial case in which the 
view consists of one of the database relations. This omission apparently reflects our poor 


understanding of the subject. 


In one of the first works dealing with view updates, Dayal and Bernstein [DB] stipulated a 
notion of correct translation of a view update, and gave some straightforward conditions for the 
existence of such translations . From this and subsequent works, e.g. [RS, Ca, FSD], it became 
apparent that we need a method for assigning semantics to view updates. This method should be 
formal (resolving the delicate ambiguities involved) and simple (so the users would define the 
semantics themselves, perhaps with the aid of the query system). 

An excellent solution to this problem was suggested in the work of Bancilhon and Spyratos 
[BS, Sp]. They developed an elegant theory (quite independent of the relational model) of 
database mappings, i.e. functions from database states to database states. A view v is such a 
mapping, and so is an update wu on the vicw. How can we translate u? The translation, 7,,, must 
be such that the updated database maps via v into the updated view. As may be suspected, there 
are typically many Ts, so the problem remains. Bancilhon and Spyratos resolve this ambiguity 
by the notion of the complement of a view. A complement of v is another view v’, such 
that the mapping s->(ws), v'(s)) (where s denotes the database state) is one-to-one. In other 
words, any information lost by v can be recovered by v’. A view has many complements (for 
example, the identity mapping is a complement of all views). Choosing a complement that must 
remain constant assigns unambiguous semantics to a view update. The scenario is the following: A 
user defines a view. Before updating the view, the user must define (probably with the assistance 
of the system) another view (a complement of the first), which must be held constant during 
updating (this corresponds to the “rectangle rule" of [Ch] and the "absence of side-effects” of 
{DB}). Using this information, the system translates (or rejects as untranslatable) the user’s 
updates, 

Translating under constant complement amounts to finding a database state s’ such that 
Ws'J=uws) and v(s')=v'(s). By the definition of a view complement, s’ will be unique if it exists 
at all. Thus, if such an ss’ can be found for any s (in which case we say that u is v- 
translatable), we can translate u as the database update T =X vy ux vy). The soundness of 
the overall approach is demonstrated by the following facts [BS]: 

i) T,, is consistent, i.e. the updated database always maps, under the view definition v, on 


the updated view (formally v7,,=uv ); also T,, is acceptable, meaning that if u does not change 
the view, no change is made on the database either (i.e. for all s, uws)= ws) implies T,(s)=s). 

ii) Suppose U is a set of view updates which is reasonable in the sense that it satisfies | 
minimal user requirements, i.e. it is closed under composition and there is a means to cancel the 
effect of every allowed update on the view (formally, if u. w €U then uw €U, and if s is a 
database state and u€U, there is an update w€U such that wuws)=Ws)). If v’ is a view 
complement such that any update in U is v-translatable, then the mapping which associates to 
an update u in U the database update 7, is a morphism, ie. T,,,=7,7Ty for all u, w €U 
(clearly, any reasonable way to translate a set of updates should have this property, i.e. the result 
of the translation should be the same whether the user applies two updates from the set one 
after the other or their composite update). On the other hand, the converse also holds: if T is a 
mapping on U such that, for every u€U, 7(u) is a consistent and acceptable database update, 
and also 7 is a morphism (i.e. T is a reasonable way to translate view updates into database 
updates), then there is a view complement v’ such that, for every uv€U, u is v’-translatable and 
T(w=T,,. 

However, as was pointed out earlier, this approach is essentially independent of any 
particular data model. In this work we investigate some of the issues and problems which arise 
when one attempts to apply this methodology in the context of the relational model, with a view 
towards rendering it realizable in practice. In the remaining part of the Introduction we review 
briefly some basic concepts and notations of the relational model, and we also give an overview 


of the results obtained. 


1.2. The Relational Model - Basic Definitions 


The relational model [Col] assumes that the data are stored in two-dimensional tables called 
relations. The columns of a table correspond to attributes, and the rows to tuples (records). Each 
attribute A has an associated domain of values D 4, and a tuple is viewed as a mapping from the 


attributes to their domains. We use the letters A, B, C, ... to denote attributes and the letters ... , 
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X, Y, Z to denote sets of attributes. A relation scheme is a finite set of attributes labeling the 
columns of a table, and is usually written as a string of attributes. If X is a relation scheme 
labeling the columns of a relation R, we say that R is defined over X. A database scheme D is a 
finite set of relation schemes. A database over D is a set of relations containing exactly one 
relation over each relation scheme in D. 

In the context of the relational model, one way of formulating queries (and thus of defining 
views also) is by using a set of operators defined on relations (relational algebra [Col, Co2]). The 
operators that will be of primary interest to us are projection and join. The projection of a tuple 
t to a set of attributes Y, written 4X], is simply the restriction of 1 to X; the projection of a 
relation R to X, written w y(R), is the set of projections of the tuples in R to X. If Rj, Rp are 
relations defined over the relation schemes Y;, X> respectively, the join of R, and Ro», written 
R*R> is the relation over X }UX consisting of all the tuples p such that p[X JER), p[X JER >. 

Semantic information is captured by means of integrity constraints (i.e. predicates on 
relations), usually expressed as first-order sentences called dependencies [Col]. Various kinds of 
dependencies have been defined and studied in the literature; we will be primarily concerned 
‘with functional dependencies and multivalued dependencies (the latter are a special case of join 
dependencies). 

A functional dependency (FD) [Ar, Col] is a statement of the form X¥-» Y, where both Y and 
Y are sets of attributes. The FD X-+Y holds in a relation R if for all tuples p and » of R, if 
p[X]=[X], then p[Y]=o[Y). 

A join dependency (JD) [R2] is a statement of the form *{R;, ..., R e where each R; is a 
relation scheme. The JD *{R;, ... , Rd holds in a relation R if * @ R{R= R. A multivalued 
dependency (MVD) [F, Zal] is a JD with at most two relation schemes. An MVD *[Rj, Ro] is 
also written as R;MR,~>—->R, (or equivalently R;NR>>—-R>). 

Finite sets of dependencies will be denoted by =. A database schema S is a pair (D, 2), 
where D is a database scheme. An instance (i.e. a database DB over D) which satisfies the 


dependencies in 2% is called Jega/ (notation: DBF=~). 


ll 


For a more detailed exposition of the fundamendal notions and notations of the relational 


model and of the relevant theory, see [U1]. 


1.3. Outline of the Thesis 


This work is mainly centered around the study of the computational problems arising when 
one attempts to apply the general methodology proposed by Bancilhon and Spyratos in the 
context of the relational model. We discover that very interesting theoretical questions already 
arise at very simple cases of the application. In particular, we concentrate on database schemas 
consisting of a single relation, with integrity constraints which are (for the most part) just 
functional dependencies. The views we consider are simply projections of the relation. Working 
with a single relation corresponds to some unrealistic universal relation assumption {U2}, but it 
yields a simplified problem which must be conquered first. Functional dependencies constitute a 
simple and practical class of constraints. Projective views are, again, the simplest imaginable, and 
they are also important from a practical point of view. 

In Section 2 we characterize when two projections are complements of each other. There is 
an interesting parallel between this characterization and the notion of independence of Rissanen 
[R1]. Our necessary and sufficient condition (which can be generalized to include the presence of 
join dependencies) states that the common part of the projections must be a superkey of one of 
the projections. As a consequence, it is easy to test whether two given projections are 
complementary in a schema. It is also possible to construct a nonredundant (minimal) 
complement of a given projection in polynomial time. Unfortunately, finding a smallest (i.e. with 
fewest attributes) complement of a given projection is shown to be NP-complete. 

In Section 3 we study how to implement the insertion of a tuple into a projection, keeping 
a given complementary projection unchanged. We show that this can be done in a unique way, 
and so the problem reduces to testing whether the resulting database is consistent. We show that 
this test can be carried out in time cubic in the number of tuples of the view. Since this is likely 


to be impractical, we also develop two alternative stronger tests that can be executed more 


2 


efficiently. 

Ideally, we would like the time complexity of our update algorithms to depend on the 
number of attributes, functional dependencies, and other parameters of the schema, not of the 
instance. When the time must depend on the number of tuples, we would at least like this 
dependence to be logarithmic, since this number is expected to be very large. However, 
complexities like those described in the previous paragraph resemble, in a practical sense, 
exponential complexities. We show some negative complexity results which suggest that this 
"exponential" behavior is inherent: The translatability problem becomes II hard [St] if the 
view is represented in some exponentially succinct way (e.g., as the union of two Cartesian 
products). Even one of the simpler, stronger tests mentioned above becomes co-NP-hard. 

Finally, we examine the complexity of finding a complement which renders a given 
insertion translatable. We show that this problem is polynomial in the number of tuples of the 
view, but inherently exponential in the size of the schema (and the Jogarithm of the number of 
tuples of the view). Similar results can be obtained for the two stronger tests. 

In Section 4 we extend these results to the case of deletions and replacements of tuples. We 
find that, for the most part, the extension is rather straightforward. Finally, in Section 5 we 
define and examine a new kind of functional dependencies which is important in the context of 
complements, the explicit functional dependencies. We extend our characterization of 
complementary projections to also allow for the presence of explicit functional dependencies. 


Section 6 concludes this work by pointing out some directions for further research. 
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2. Defining a Complement 


Let S be a database schema (U, 2), where U is a universal set of attributes and % is a finite 
set of dependencies. A relation R over U (instance of U) is called legal if it satisfies all the 
dependencies in = (notation: REZ). A view of S is for us a projection defined by a subset X of 
U. For each instance R, the corresponding instance of the view is 7y(R). We disambiguate 
updates on a view by defining a second view, Y, the complement of X. Two views X and Y are 
called complementary if @ y(R)=m y(R') and m y{R)=a y(R') imply R= R’, whenever R and R’ 
are both legal instances. In other words, the two views together contain enough information to 
reconstruct the whole database. 

When are two views X and Y complementary? Clearly, a sufficient condition is that the 
MVD *{X, Y] holds in every legal instance, i.e. Z implies the MVD *[X, Y]. If this is the case, 
the database can be reconstructed from its projections on Y and Y by join. Recently it has been 
shown [V1] that the condition is not necessary, i.e. if £ consists of general first-order sentences 
then wy and wy can be complementary without the reconstruction operator being the join. 


However, we show that this cannot happen if we impose more restrictions on 2: 


Theorem 1: 
Let Z consist of functional dependencies and join dependencies. Then X, Y are 


complementary iff Ze*[X, Y]. 


Proof: The “if" direction is immediate: if = implies the MVD *{X, Y], then for every legal 
instance R we have # (Ra y(R)= R. Consequently, if for two legal instances R, R' we have 
wy(R)=m y(R') and a p(R)=9 (RK), we get wy(RP a f(R=9 (Ra (R) and from this 
R=R’, ie. X, Y are complementary. 

For the "only if" direction, assume that £ does not imply the MVD *[X, Yj; we will show 
that X, Y are not complementary, by exhibiting two distinct legal instances R, R’ for which 
wy(R)=a (R) and wy(R)=9 YR). 
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Let o be a join dependency *[R yp,» RA; define M(o) to be the set of MVD’s 
{ *[ Ves, R;. Vies, Rj, S;, Spa partition of {1, ... , g} } (see also [MSY)). If X’ is the set 
we obtain if we replace each join dependency o in 2 by the multivalued dependencies in M(o), 
then, since o implies each MVD in M(o), = implies =’; but by our hypothesis Z does not imply 
*LX, Y], so =’ does not imply *[¥, Y] either. Now since =’ consists of FD’s and MVD’s only, 
there is a two-tuple counterexample to this implication [SDSF], i.e. there is a relation R 
consisting of two tuples » and » which satisfies all the dependencies in Z’ but does not satisfy 
*[X, YI. 

From the relation R construct another relation R' as follows: since R docs not satisfy 
*[X, Yj, it must be that p[X¥N Y]=v[XN Y], and also p[Y-X]#»[¥-X] and p[X-Y]*#v[X-Y]. Let R’ 
consist of a tuple »’ which agrees with » on X and with » on Y-X, and of a tuple pv’ which 
agrees with » on X and with p on Y-X. Clearly, R#R’, R’ satisfies all the dependencies in 2’ (it 
defines the same “special truth assignment" [SDSF] as R), and also #y(R)=1y(R') and 
a yR=a2 y(R?). Thus, we only need to show that R and R’ are both legal, i.e. they both satisfy 
all the JD’s in = (they obviously satisfy the FD’s in =, since these are included in =’ and R, R’ 
satisfy =’). 

Let “{R), Abs Rd be a JD in %; to show that it holds in R, it suffices to show that, if a 
tuple ¢ is obtained by joining ¢ /[R J], .... dR ql. where {), ..., € q are tuples of R, then either 
€=p or =». This is certainly true if €)=-~-=€g=h or if §,;=..=€ =»; else, let 
S,=tt € =p}, Sp={é €;=0}. Since the MVD *[ Ves, R;, Vies, Rj is in 2’, it holds 
in R, and thus either =p or =». Thus R satisfies all dependencies in 2, and so does R’ (by 


the same argument). This completes the proof. | 


Notice that our condition (though not the proof) parallels the result of Rissanen on 
independence [R1]. Intuitively, independence is stronger than complementarity, and thus our 
Theorem contains only the first condition of [Rl]. To see why, consider the classical Employee- 
Department-Manager scheme. The decomposition into X=ED, Y =EM is not independent, 
although Y and Y are complementary. 
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Theorem 1 has some algorithmic consequences: 


Corollary 1: 


Given (U, 2), X, YCU, whether X, Y are complementary can be tested in polynomial time. 


Prrof: By Theorem 1, testing for complementarity amounts to inferring an MVD from a set of 


FD’s and JD’s. The latter can be done in polynomial time [MSY, V2]. i 


Corollary 2: 
Given (U, =) and XCU, we can find in polynomial time a minimal (nonredundant) 


complement of X. 


Proof: Simply start with the trivial complement U and repeatedly take out any attribute in ¥ 
which can be taken out without affecting complementarity (examine the attributes in some 


arbitrary order). | 


Thus we can program in a database system some guidance to the user towards the 
definition of a complement. Unfortunately, as so often happens, finding the minimum is much 


harder. 


Theorem 2: 
Given (U, 2), XCU and 40, determining whether there is a complement Y of X with 
IYl=k is NP-complete. 


Proof: Membership in NP is obvious: just guess a subset Y of U with |Y|=k and verify 
(Corollary 1) that X, Y are complementary. 

To prove the hardness part, we will make a reduction from the 3-satisfiability problem 
(3-SAT), which is known to be NP-complete [Ck, K, GJ]. Let p be a Boolean formula in 3- 
conjunctive normal form (3-CNF); let x, i=1,...m, be the variables occuring in @, and let Sp 
j=l,...m, be the clauses of q. We construct the following schema Sg=U, z): U is 
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FyF XX 1--X,X ,A and = contains the functional dependencies F/...F,X;?X'p 
Fi). FigX 7X) i=l,...,.n, and also for each clause f= lyth it 13, j=1,..,.m, the functional 
dependencies Ljj>F;, L ig? F; L; yg F; Gf J; =x, Li=X; if ii= “TX Lj=X'p. 

Now let X be Fp..FjX pX p--XpX ps We na that XY has a complement Y with |YJ=l+” 
iff m is satisfiable. To see this, first assume that @ is satisfiable, and let A be a satisfying 
assignment. Take Y to be L)...L,A, where L;=X; if h(x) is true, L,;=X’; if h(x) is false. To 
show that Y, Y are complementary, it suffices to show (by Theorem 1) that ZF*[X, Y]; to do 
that, we use the chase method for inferring dependencies [MMS]: if we consider the ‘ableau 
consisting of a row with distinguished variables in the X columns and a row with distinguished 
variables in the Y columns, then we can convert the second row into a row of distinguished 
variables by using the F'D-rules corresponding to the FD’s in = as follows: first, since h satisfies 
ap at least one of the FD’s {L; 17 Fp Lia Fj Li3 Fi can be used to fill in Fi and this can 
be done for all j. Then the es FF yy Xj7X pF p-FyyX 7X; can be used to fill in the 
remaining X;’s and X’/s. 

For the converse, suppose there is a complement Y of X with |Y|=1+n. Clearly Y has to 
contain at least one of {X;, X"j} (else there is no way to fill in both X; and X’), and thus Y 
contains exactly one of {X, XxX i for each i (also A€ Y). Consider now the assignment h, where 
h(x) is true if X,€Y and false if xX EY: since F; is filled in, at least one of {L ‘ib Lin Li3k must 


J 


be contained in Y, and thus A satisfies f, This is true for all j, so h satisfies p and the claim is 


fj 
established. 
Finally, it is easy to see that So and X can be constructed in time polynomial in the length 


of qm. This completes the proof. | 


Observe that in our reduction we only used FD’s, so Theorem 2 is true even if % is 
constrained to contain only FD’s. Now if X'={Z->—>B | ZB is an FD in 3}, then if o is a 
JD Xo iff X'~o [BV]. Thus, we might as well replace = by =’ in our proof, which means 


that Theorem 2 is true even if = is constrained to consist of MVD’s only. 
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3. The Translation of Insertions 


3.1. Testing Translatability 


= is now a set of functional dependencies: we furthermore assume that each FD in & is of 
the form XY->A, where A is a single attribute (this is easy to enforce, by replacing each FD 
X-—Y in = by the equivalent set of FD’s {X->A: A€Y}). 

Suppose that the view X and its complement Y are given, and so is the current instance V 
of the view. We wish to translate the update u on the view consisting of the insertion of a tuple 
t, while keeping the complement m (R) constant. How can we design an update on R, T7,, 
which achieves this? 

The translation 7, should have certain obvious properties: 

-A. It should implement the view update, that is 7)(7,[R)J=VU1. 

B. It should keep the complement constant, according to the prescribed semantics; that is, 
a (7, [RD= 0 (R). 

C. It should yield a consistent database, that is, if R is a “possible” instance, T AR\F 2. The 
meaning of “possible” is the subject of ssopeity D below. 

D. A more subtle but important assumption is that the user proposes the update based on his 
knowledge of the view and on no other information concerning the database. Thus, the 
translation should produce a legal database for all legal instances of the overall database, given 


the instance of the view. 


It is quite interesting that these properties determine precisely when the insertion of a tuple 
t in an instance V of the view is translatable, and, if it is, the translation 7, is unique. 

First, suppose that /¢ V (otherwise T,, is the identity). Since w y(R) must be kept constant 
(Property B) we must assume that (XN YIEw yn (R= 2 yj V); otherwise, the only way to 
insert (in 7 y(R) (Property A) would be to insert something in # y(R). By Theorem 1, YN is a 
superkey of either X or Y. If it is a superkey of X, then the update is clearly untranslatable, 
because VU1 is not the projection of a legal instance (Property C). So XN Y-—~ Y. It follows that 
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the only 7), satisfying A, B and C is the insertion of the tuple ta y(R) in the database R: 
T{R]=R U tay(R) (* denotes the natural join). 

It remains to determine under which conditions 7,{R] is legal (Property C). The insertion of 
t into V is translatable iff T,[RJF= for all R such that REX, 1 y(R)=V (Property D was used 
here). 

Suppose that the insertion is not translatable. This means that there is a functional 
dependency, say Z-> A, which is violated by T,{R) for some R for which REZ and aw (R= V. 
Since R satisfies Z-> A, the inserted tuple must be the culprit. Thus, there must be a tuple r of 
V which agrees with ¢ on ZMX and, if ACX, disagrees with t on A. Furthermore, if we fill the 
rows of V with new symbols in the columns of Y-X, only with {ZN(Y-X)]=p[ZN(¥-X)] where 
p is a tuple agreeing with ¢ on XNMY (call this relation R(V, 4 r, Z—A)) and then perform the 
chase [MMS] wrt Z on this relation, no two distinct elements of V, neither the elements 
corresponding to 7A], plA] (if AE Y-X), are ever equated (if they are, we say the chase 


succeeded). It turns out that this is a necessary and sufficient condition for untranslatability: 


Theorem 3: 
The insertion of ¢ into V (*€V) is translatable as ReR U fw y(R) if and only if 
(a) AXNYJEw yn fV). , 
(b) 2 implies XNY->Y, and ZX does not imply XNY—YX. 
(c) Chasey[R(V, 4 +, | succeeds for all functional dependencies f€Z and tuples r of V. 


Proof: By the preceding discussion, all we need to notice is that, if Chases[R(V, 4 1, f)\ does not 
succeed for some FD f€Z and some tuple r of V, then it actually provides us with a 
counterexample, i.e. it constructs a relation R such that REZ, 1 y(R)=V, and 7,JR] violates f 
In the opposite case, the chase actually provides us with a proof that there can be no relation R- 
such that REZ, 7y(R)=V, and 7,[R] violates some f€X, i.e. T/[RJ=X for all R such that 
REZ, wy(R)=V. i 
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Corollary: 
Whether an insertion is translatable can be tested in time O¢ V3 fog|V| [7 |Y-Xp. 


Proof: Clearly, condition (a) can be tested in time O(|V)), and condition (b) can be tested in 
time O(\Z[) (using the linear-time algorithm [BB] for inferring an FD from a set of FD’s). Since 
condition (c) can. be tested by doing O({Z| |V)) chases, it suffices to show that the 
chase of R(V, 4 7, f) can be computed in time OAVI7log|V| j=] | Y-X). Recall that the chase 
procedure consists in repeatedly locating a pair of tuples p, » such that p[Z]=»[Z] and 
p[A]*v[4] for some FD Z— A in &, and replacing the element p{A] with v[A] throughout the A 


column. This can be done by the following straightforward algorithm: 


Initialize R” to be RIV, 4 7, Pp. 
Repeat until no new change is made on R’: 
For each FD Z->A in 2 do: 
Sort R” lexicographically according to the elements of the Z columns. 
Find the first pair of consecutive tuples p, » such that p[Z]=»[Z], p[A]*[ A]. 
Replace p[A] by »[A] throughout the A column. 


It is clear that each execution of the body of the for loop takes time O(|V[log|V)), so each 
execution of the for loop takes time O(V|log|V| |Z. Since each time the for loop is executed 
the number of distinct symbols in the Y-X columns is reduced by at least one (if the chase ever 
attempts to equate two different elements in one of the X columns we stop immediately), and 
initially we have | Y-X]|V| such symbols, the for loop will be executed at most |Y-X||V| times, and 
so the total time is at most O(\MIlog\V| |Z| 1¥-XD. a 


The algorithm described above can be speeded up by taking the following straightforward 
shortcut: to construct R(V, 4 r, Z— A), first fill the rows of V with new symbols in the columns 
of Y-X, then do a chase (and store the resulting relation to be re-used for other members of 2), 
and then set AZN(¥-X)]=p[ZN(¥-X)], for some p» agreeing with ¢ on XN Y. However, since we 


are still unable to provide a better quarantee for its worst-case performance than O(| Vir iogl Vy, 


its applicability in practice is dubious, in view of the fact that |V| is normally very large. For this 
reason, we will now present two alternative tests for which we can show better upper bounds to 
their worst-case performance. However, our tests will be stronger than necessary, i.e. in addition 


to rejecting all untranslatable insertions, they may also reject some translatable ones. 


Test 1 


Our first alternative test consists in simply avoiding to do a full chase on R(V, 4 7, ZA); 
instead, for each tuple mw agreeing with ¢ on XM Y, we do a chase on the two-tuple relation 
consisting of r and yt, and we report success if any of these chases equates r{ A], p[ A] (if AE Y-X; 
notice that in this case p[A]= 4 A], since YN Y— Y), or attempts to equate two distinct elements 
of V. Thus, what we are actually doing is imposing the extra requirement that 
Chases[R(V, 4 1, J\ succeeds fast, if it succeeds at all. Intuitively, this does not seem to be very 
restrictive, and one may hope that Test 1 will actually accept most of the translatable insertions 
that will occur in practice. 

The test can obviously be implemented in time odvi2 ||). However, we can do better (in 


terms of the dependence of the time complexity on |V), as follows: 


1. Fill the rows of V with new symbols in the Y-X columns. Then determine the set of tuples 
T={p: p[LXNYJ=qXNY]}. This can be done in time O(V). 

2. For each ZCU, construct a copy of the relation 7 (call it 77), and sort it according to the 
contents of the Z columns. This can be done in time oQlUl |V\log| VD. 

3. For each ZCU, compute the closure of Z under &, ie. the set Zt ={A: TE Z— A}. This 
can be done in time ollUl |=) (using the algorithm of [BB] for computing closures). 
4. For each ZCU, go through the table 7-7 from top to bottom and, whenever a tuple agrees 
with the previous one on Z, make it agree on Z*. This has the effect of making all tuples 
which agree on Z to agree on Z* (as they should), and it can be done in time ol |W. 
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5. For each FD Z->A in = do: 
For each tuple r for which {ZN XJ=4ZNX] and x A]#q A] (if AEX), do: 
Make r agree with » on ZM(Y-X), where p is a tuple in T. 
For each ZCU do: 
Insert r in Tz. 
If 4Z]=»[Z], where v is either the tuple next to r or the tuple before r in 77, 


then make r agree with » on ZT. 
This can be done in time O(|Z] [VJ 21 log|V)). 


Thus, the overall time expended is O(Vlog|V| 1 j=). Of course, there are various 
optimizations and shortcuts one may employ in an actual implementation (for example, to 
handle the potential problem of having too many sorted tables - say by actually having for each 
Z a sequence of pointers to the tuples of 7). Observe that the running time of this algorithm will 
be better than our worst-case upper bound for the exact translatability test (and also better than 
the obvious O4 V2 |Z) algorithm) if |V|/log|V| > U1, which is definitely going to be the case in 


practical situations. 


Test 2 


Our second alternative test has a somewhat different flavor: notice that Test 1 saves time by 
doing only part of the computation necessary for each particular chase. Test 2, instead, will only 
do one full chase, if this is possible. 


More specifically, recall that the essential part of the translatability test (in terms of time 
requirements) is checking if for all R such that REZ, 7 y(R)=V, we have TAR\= x. Suppose 
now that Y actually has the following property: 

For all R, Rz such that RFX, REZ, wyR)=7y(R)), 
AXNYEm yn WRV=a yj Ry), we have that TR JF x iff T[RIF =. 

We call such a Y a good complement of X. Our interest in good complements lies in the 
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fact that, if Y happens to be a good complement of X, then clearly all we need to do to test if 
the insertion u is translatable is consruct some relation Rp such that Ro, 7 y(Ro)=V, and 
test if T ARIE 2. We can construct such an Rp by filling the rows of V with new symbols in 
the Y-¥ columns and then doing a chase; this can be done in time Of(| Vi7log\V| [=| | ¥-XD. 
Testing if TJRgl=Rg U fa y(Rp) satisfies Z amounts to testing if for each tuple p of Rg, the 
two-tuple relation consisting of p and fa (Rv) satisfies all the FD’s in 2; this can be done in 


time O(V| |=). 


Thus, all we need to do is show how one can test if a given complement Y of X is actually 
a good complement. Observe that this property is independent of the tuple / to be inserted, i.e. 


it is a property of the schema only (X, Y and 2%). 


Suppose, then, that Y is not a good complement of X. This means that there are two 
relations Rj, Ry such that R/F=X, R=, w (R= a (Ry), AXOVEw yp WR P=a yp ylRy), 
TAR =X and T[R)J=R; VU fay(R}1) violates some FD in 2, say Z-> A. Since R |, 
there must be a tuple p, in R, such that p)[ZJ=a7[t*a (RJ, wlAl¥a lea y(R 7). Also 
there must be a tuple v, in R, such that » [XN Yj=qXNY]. 

Since 1 (R ))=m (Ry), we can then find two tuples 4, » 7 of R» such that pX)=p 1%) 
v4X]=» |X]. Now consider the relations R';, R 2 consisting of py, v, and of py 2 
respectively. Clearly, R'jZ, R'x- 2 (since 2 only containd FD’s), wy(R' J=7 xR 9, 
AXNYEw yj AR p=ayy fF’ y), TAR RE and 7,{R' )] violates ZA. 

Thus, Y is not a good complement of ¥ iff there are two relations R’;, R'y with at most 


two tuples each which witness this fact. 


From the above observation, we can easily see that we can test if Y is a good complement 


of X by doing the following for each FD Z->A in 2: 
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Initialize T; to be a relation with three tuples pj, v p ty as follows: 
t[W]=ag for each W in U, 
v [Wl=ap for W in Y, a, for W in X-Y, 
pWl=a,y for W in Z, ay for W in U-Z. 

Initialize 7) to be a relation with three tuples p>, v2, ft as follows: 
tg=t,, ¥2=v,, wbAW|=ap for each W in U. 

Repeat until no new change is made on either 7, or T>: 
Compute the chase wrt 2 of p,;, v, (in this and all subsequent chases, to equate 
a; and a, i < j, replace a; by aj. 
BAM HAL » AX e» 1X1 
Compute the chase wrt = of B, #2 of vx by and of By ty 
BAX] «psy, v [X]~» AX. 


When the above procedure terminates, we check if pA]=7,[A]. If not, then 7), T 
constitute a counterexample to the goodness of Y; if it turns out that p [4]=7¢,[A] for each FD 
Z— A in X, then we have actually proved that no such counterexample with at most two tuples 
in each relation can exist, and so Y is a good complement of X. 

Since each execution of the repeat loop can be done in time O({Z|), and each time we lose 
at least one out of O(U) symbols, the running time of the algorithm is Odz|? |UD (the 


procedure is repeated at most || times). 


Notice that, if Y happens to be a good complement of X, then actually Test 2 accepts 
precisely the translatable insertions, whereas in the opposite case it rejects all insertions. 
However, testing if Y is a good complement of X can be done once and for all at the time Y is 
declared as the complement to use, and if it is found to fail then the database system can simply 


disregard Test 2. 
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3.2. Complexity of Testing Translatability 


So far, we have shown how one can test if a proposed insertion of a tuple into a view is 
translatable, and if so, how to do the translation (Theorem 3). We presented an O(| VP ioglWD 
algorithm for testing translatability. Since this algorithm is likely to be inefficient in practice, we 
also developed two alternative stronger tests, which can be executed faster. 

In the sequel, we are going to prove a result which has some negative implications 
regarding the extent to which one can hope to improve the running time required to test 
translatability. Specifically, we will show that, if the view is presented in an exponentially 
succinct way (i.e. as a union of Cartesian products) then testing translatability becomes IT -hard 
[St]. This result provides strong evidence against the possibility of having an algorithm that runs 
in time less than O(|V)), i.e. it indicates that the whole view has to be examined in order to test 
translatability. 

Moreover, we believe that this result also casts some doubt on the possibility of 
substantially improving the running time of our algorithm. Loosely speaking, IT f-hardness 
seems to indicate that the problem lacks a "nice" combinatorial structure, which could be 
exploited to yield an algorithm considerably more efficient than the one resulting from our 
(more or less) straightforward approach. 


We will now prove the result: 


Theorem 4; 
Determining if an insertion is translatable is II ,?-hard if the view V is given implicitly as 


the union of two Cartesian products, of total size OU). 


Proof: Let G be a Boolean formula in 3-CNF, containing the variables x, i=1,...n, and 
consisting of clauses fp j=1,....m, and let X={x), .., xy}, Y={xp4 bows x,} be a given 
partition of the set of variables of G. It is known [St, W] that determining if for all possible 
assignments of truth values to the variables in X G is satisfiable, ie. if VX3Y G(X, YJ=1 (where 


VX means Vx;..Wx, etc.) is TI P-complete. In what follows we give a polynomial-time 
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reduction from this problem to the problem of testing translatability of an insertion to a 
succinctly presented view. 
Let U be BX)X')...X nt nAF pF C, and let Z consist of the FD's X [x pnX 5X KA 


FFG, BA-—C, and, for each clause Shit iat 43 of G, the FD’s L,jA>F, LAF, 


Lj3A>F; (where Li is X, if is —x,). Let the view be 


BX /X' }...X,X jp and let the complementary view be XX’ )...X,X" ,AF}..F,,C. Finally, let the 


Li is x, and Lj; is Xx’, if Li 
instance V of the view be SBXSy, X Xo Xx gi U s, where sy 2 is a relation over X;X’; 
consisting of two tuples p, v; with p{XJ=0, p{X’ J=1, »{XJ=1, vy {X'A=0, sp is a single tuple 
over B with sp{B]=5, and s is a single tuple over BX /X';..X,X', with {Bj=a, {XJ=1, 
sf X’ jl=1. Observe that V is essentially just a list of all possible truth assignments: each tuple p 
of V, with the exception of s, defines an assignment fA: {x fe aes x7 {0, ]} by taking 
Ax)=WX) GX J=7ylX); also, p[B]= 6. 

Suppose now we want to insert in V the tuple 4 where ABl=b, 
AX )X' 1..X)X l= IX )X' }-X,X y| We will show that this insertion is translatable iff 
VXAY G(X, Y)=1. First, it is obvious that conditions (a) and (b) of Theorem 3 are satisfied. 
Furthermore, observing that the only tuple agreeing with ¢ on X;X’ ;...X,X",, (the common part 
of the view and the complement) is s, it is easy to see that condition (c) is satisfied if the FD fis 
taken to be X;X")...X,X' ;-> A (because the only tuple agreeing with ¢ on X)X’ ;..X,X', is 5), or 
if fis F}...F,,—>C (since no attribute of fis in the view), or if fis LijAF; (since s agrees with ¢ 
on all possible Lis). 

Thus, all we have to show is that, for all tuples r with r#s (these are the tuples agreeing 
with ¢ on B), Chasey[R(V, 4 7, BA->C)} succeeds (i.e. starting with r{A]=s[4] we eventually 
equate AC], {C)) iff there is a satisfying assignment A for G which agrees with the one defined 
by ron {x .. , xz}. 

First, suppose there is such an assignment h, and let r, be the tuple corresponding to it. 
Since JX pX'7..X,X pl=AXX p-X,X yh rAAl=AAL so r_[A]=SA]. Since A satisfies Sp 
rl a=) for some i, so AL =AL jd i.e. TALL Al= 41:4), and so AF A=4F A, for j=1,...,m. 
Thus, rp[F}...F =F ...F_), 80 rylC]=4C]. But since r,{BA]= ABA], r[C]=AC], and thus 


AC]=SC], ic. Chases[R(V, 4 1, BA—C)] succeeds. Conversely, it is not difficult to see (by 
essentially tracing the previous argument backwards) that {C], {C] can only be equated if there 
is a tuple corresponding to a satisfying assignment and agreeing with r on X (x pod 4X k 

Thus, we have established that the insertion of ¢ into V is translatable iff 
VX4Y G(X, Y)=1. Since U, Z, t and the description of V as a Cartesian product can obviously 


be constructed from G, X, Y in polynomial time, we are done. | 


It certainly is not surprising that using a similar (only simpler) construction we can show an 


analogous result for Test 1: 


Theorem 5: 
Determining if Test 1 accepts an insertion is co-NP-complete if the view V is given 


implicitly as the union of two Cartesian products, of total size OQUD. 


Proof: We first show membership in co-NP: if X denotes the view and Y the complement, then 
the following is a non-deterministic polynomial time algorithm to determine if Test 1 does not 
accept the insertion of the tuple ( into V: guess an FD Z— A in % and two tuples +, p over X, 
and verify that rEV, pEV, AZNXJ=qZNX] (and AAl#AAl, if AEX), pB[XNY=A4XN VI; 
construct a relation R consisting of two tuples 7’, p’, with r[X]=7, p[X]=p, and with new 
symbols in the Y-X columns, only with r[ZN(Y-X)]=n'[ZN(¥-X)}; compute the chase wrt Z of 
R, and verify that it does not attempt to equate two distinct elements of r, p, and, if A€ Y-X, it 
does not equate HA}, pA]. 

To prove the hardness part, we will make a reduction from unsatisfiability of Boolean 
formulas in 3-CNF. Let G be such a formula, with variables x, i=1,...,2, and clauses Fp 
j=1.,....m. Let U be BX 1X pe*, eg nc and let % consist of the FD's B->C, and, for each clause 
i= lyt+Tigt Ig of G, the FD LjLj2 gC, let the view be BX )X';...X,X', the complement 
be X )X";...X,X",C, and the instance V of the view be spXsy yt X—-XSy X U s, where sp, 
Sy X', are as in the Proof of Theorem 4, and gJ=a, {XJ=0, §X’J=0. 


Suppose now we want to insert in V the tuple ¢ where 4(B|=6, 
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{XX 7..X.X pl= IX )X 1X ,X ,]: we claim that this insertion is accepted by Test 1 iff G is 
unsatisfiable. To see this, observe that the only tuple of V agreeing with ¢ on X [x pe, 1X " (the 
common part of the view and the complement) is s; by a reasoning similar to that in the Proof 
of Theorem 4, we see that the only FD which needs to be checked is B->C. Now if ré#s, the 
chase on rs, s will equate ACh, SC] iff for some j, AL pL jah j3l= IL Lah js) i.e. AL; J=0, which 
means that the assignment corresponding to r does not satisfy i Since this should happen for all 
tuples 7, the claim is established. 

Since U, X, ¢ and the description of V as a Cartesian product can be constructed from G in 


polynomial time, the proof is complete. | 


3.3. Finding a Complement 


So far, we have assumed the following scenario for translating view updates: when the user 
updates a view, he also specifies unambiguously the semantics of the update by defining a 
complement which should be kept constant during the translation. We studied in detail the 
problem of checking if a proposed insertion of a tuple into a projective view is translatable, 
when the complement is another projection and the database consists of a single relation 
satisfying a given set of functional dependencies. 

However, a real database system should also be able to provide the user with some 
assistance concerning the task of defining a complement. We already gave a glimpse at this 
problem in Section 2, where, after we characterized complementary views, we examined the 
problems of finding a nonredundant complement and a minimum complement. Now that we 
have also gained some understanding of testing translatability, we can pose the following 
question: Suppose the user wishes to have the update translatable, imposing only partial (or 
none at all) restriction on the complement to be used. How can one determine a complement 


which will render the update translatable? 


Let the view be X, and suppose Y is a complement of X such that the insertion of the tuple 
t into the instance V of the view is translatable under constant Y. Clearly, Y= WU(U-X), where 
WEX. Since {W|Ea XV) (condition (a) of Theorem 3), there is a tuple r of V such that 
A W\=4W). Consider now the set of attributes W,={A: AEX, r[A]=q A]}. It is immediate that 
4{WCa wf V), 2 W,->Y (since ZW- ¥ by condition (b) of Theorem 3, and W,2W), and 
2 does not imply W,--X (if Z=W,-+X, then the insertion of ¢ into V is not translatable since 
4WA=W,], t#r); moreover, if R is a database such that RFX, wy(R)=V, then 
fa w (R= fa yfAR), and thus since R U fm» (RJ, it follows that also R U fn W (Rez, 
for all such R. Therefore, the insertion of ¢ into V is also translatable under constant 
Y= WAU(U-X). . 
From the above discussion, it is easy to see the following: 


29 


Theorem 6: 
Given 2, X, V and ¢, we can find a complement Y of X such that the insertion of ¢ into V 


is translatable under constant Y within min(\V\, lx) tests of translatability. 


Proof: One can compute, for each tuple r of V, the set W,={A: AEX, fA]=4A]}, and, after 
eliminating duplications, test, for each such W,, if the insertion of ¢ into V is translatable under 
constant Y= W,U(U-X). If no such W, is found, then, by the preceding discussion, there is no 


complement Y of X such that the insertion of ¢ into V is translatable under constant Y. & 


Thus, we can determine if there is a complement which renders a given insertion 
translatable in polynomial time (see the Corollary to Theorem 3). Observe, however, that the 
polynomial complexity depends strongly on the fact that we are allowing the whole view V as 
part of the problem instance. The following result indicates that there is an inherent exponential 
dependence on |U|+/og|V|; in other words, we may nevertheless have to check all possible 


subsets of X in order to find a complement. 


Theorem 7; . 
Determining if there is a complement Y of X such that the insertion of ¢ into V is 


translatable under constant Y is NP-hard if the view V is presented succinctly (as in Theorem 4). 


Proof: Let G be a Boolean formula in 3-CNF, containing the variables x, i=1,...n, and 
consisting of clauses ie j=1,...m; assume furthermore with no loss of generality that the 
variables appearing in each clause are distinct. Let U be ¥ [Xp X XP p-Fyp and, for each 
clause Gee gt iat lig of G, let 2 contain the FD’s Lj Fp Lig Fp Lig? F; Let the view X 
be XX" ;...X,,X",» and the instance V of the view be SY LX PR ® SKK a where sxx; is as in 
the Proof of Theorem 4. 

Suppose now we want to insert in V the tuple 4, where 4¥J=4X'J=1. We will show that 
there is a complement Y= WUF’...F,, (WCX) such that the insertion of 1 into V is translatable 
under constant Y, iff G is satisfiable. First, it is easy to see that, since 4] should be in w y(V), 


W should contain at most one of the attributes Y rs x ; for each i; furthermore, since we should 
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have that Z=W-F)...F,, (clearly & does not imply W->X), it is not difficult to see that W 
should define a satisfying assignment for G. 

To complete the proof, observe that, if R is a database such that RE 2, a x(RJ= V, then for 
any two tuples p, » of R, HLF A=olF A. This happens because, if MILL] and 
mL jl*o[L 2] (in the opposite case obviously p[F d= F, ‘D). then there is a tuple in R such 
that aL j= IL), EL jal= IL] (because the variables in fj are distinct); thus, SO alF. 
glFA=oLF), and therefore HF A=vLF). It follows that, for the insertion of 4 into V to be 
translatable under constant Y, it suffices to have AW\€ay(V) and Z=W>F)...F, 

Finally, U, =, X, t and the description of V as a Cartesian product can be constructed from 


G in polynomial time, and we are done. | 


We remark that, by following a similar line of reasoning, one can see that Theorem 6 
remains true if we interpret “translatable” as “accepted by Test 1 (Test 2)", and “test of 
translatability" as “Test 1 (Test 2)". The same holds for Theorem 7 (the reductions, although 
somewhat subtler, are based on the same idea as the reduction given in the Proof of Theorem 7, 


and are therefore omitted). 
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4. The Translation of Deletions and Replacements 


In this Section we briefly show how the ideas developed previously for the case of 
translating the insertion of a tuple to a view can be adapted in a straightforward manner to 
handle the case of deleting a tuple and of replacing a tuple with another. We continue to assume 
that 2 is a set of FD’s satisfied by the database R, and that we are given the view X, the 


complement Y and the current instance V of the view. 


4.1. Deletions 


Suppose we wish to translate the update u on the view consisting of the deletion of a tuple 
t, (EV, while keeping the complement a y(R) constant. The update 7, on R which achieves this 
should satisfy my(7JRV=V-t, 1 AT [RV=a y(R), and also T,{RJF2 for all R such that 
REZ, 2 y(R)=V (compare with Properties A through D given for the case of an insertion). 

Now since w y(R) must be kept constant, we must have that (XN YJ€a yn pWV-y); in other 
words, there is a tuple r€V such that r#4, {XN Y|=4XNY]. From this we now see that YOY 
cannot be a superkey of X (since V is a projection of a legal instance), so by Theorem 1, 
XN Y—Y. It follows that the only possible candidate for T,, is the deletion of the tuple a yR) 
from the database R: T,{R]=R-f*w {R). 

But now observe that, since 7,{R}CR and = only containds FD's, T [RIE = if REZ. Thus, 
our last requirement that 7,[R]2 for all R such that REZ, R= V, is satisfied trivially. | 


We have thus shown the following: 
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Theorem 8: 

The deletion of 4 from V is translatable as R«R-i*a VR) if and only if 
(a) (XN YEn yp fV-v. 
(b) Z implies XNY—Y, and = does not imply XN Y—X. | 


Hence, determining if a deletion is translatable can be done in time O(V|+|Z)p. 


4.2. Replacements 


Suppose now the update we wish to translate under constant complement Y is the 
replacement of a tuple tj, (;€V, by a tuple J, to€ V. The update 7, on R should satisfy 
a y(7 ARV=V-t; U ty and again a T {RY= (R), and T {Rez for all R such that RE=%, 
my(R)=V. We distinguish two cases: 


Case 1 1 [XN YJ# {XN ¥]. 

This case exhibits a behavior similar to the one we are already familiar with: specifically, 
since mR) must be kept constant, we must have 1 [XN YJ€a yn fV-tp), (AXNYEryn WV). 
From this it follows that XN Y cannot be a superkey of X, and thus it is a superkey of Y by 
Theorem 1, Hence, the only possible candidate for T y is the replacement of the tuple 1)" y(R) 
by the ‘tuple 17*9 VR). 

To check now if the last condition is satisfied, i.e. if T {Rx for all R such that RF=Z, 
my(R)=V, it is not difficult to see (by a reasoning exactly analogous to the one given for 
insertion) that all we have to do is check if Chases[R(V, 1 1, J] succeeds for all FD’s fin 2 
and for all tuples r in V which are different from 1). 


Case_2 1{XNY=t{XNY]. 

In this case we see that the first two conditions can be satisfied with no further restrictions 
on V, 4 X, or Y, and moreover the only possible candidate for 7,, is replacing the set of tuples 
t;*m y(R) by the set of tuples t7*w y(R) (we can no longer assert as before that either set will 
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consist of a single tuple, since this depended on XM Y being a superkey of Y, which is no longer 
necessary). 

Checking whether the last condition is satisfied, ie. whether 7,[R]F= 2 for all R such that 
REZ, 7 y(R)= V, can still be done by checking if Chasey[R(V, t4 17, J] succeeds for all fin 2 
and for all r in V, r#1; (one can see that the fact that ¢)*# )(R) and 1)*m y({R) may consist of 
more than one tuple does not affect anything). 


Thus, we have the following: 


Theorem 9: 
The replacement of {¢, by t in V_ (t,€¥, 1o¢ V) is translatable 
as ReR-t)*9 R) U t)*2 A(R) if and only if 
(a) @[XNYJEayn hV-ty) and UXNY Eryn pV», or 1AXNY=1fXNY). 
(b) 2 implies X¥NY—Y and Z does not imply XNY—X, or ¢[XNYJ=11XN YI. 
(c) Chases[R(V, tz 1, | succeeds for all f in & and for all r in V, rét). | 


From Theorem 9 it should be clear that one can develop results analogous to the ones 
given for the case of insertion in a straightforward way. Thus, we will not pursue this direction 


any further. 
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5. Explicit Functional Dependencies 


Functional dependencies assert that a certain mapping is one-to-one -for example, a 
mapping from employee-project pairs to managers, or from cost-price pairs to rates of profit. 
However, there is a difference; certain such mappings are essential information stored by the 
database (as in the first example above), whereas others are redundant information, mappings 
that could be computed explicitly (as in the second example). We call the latter case of FD’s 
explicit FD’s (EFD’s). 

EFD’s are important in the context of views and view complements, because they can 
seriously affect the information content of database mappings. We thus felt that we should study 
their behavior vis-a-vis the other known classes of dependencies. We first define formally what 


an EFD is: 


Definition: 
A set of attributes X explicitly determines a set of attributes Y (notation: X->, Y) if there is 
an instance- independent function f (called a witness of XY, Y) such that # y(R)=f4 YR), for 


any legal instance R of the database. 
Examples. Cost-Profitrate, Price, Course-Student-Grade>, Average-Grade. 


We remark that, in our definition of an EFD, no special property of the witness function f 
is assumed. This leads naturally to the following extension of the meaning of implication of an 
EFD o from a set of dependencies 2, where o,, i=1....,k, are the EFD’s in =: for all functions 
Ff, t=1,...k, there is a function f such that, if a database R satisfies all dependencies in 2 (where 
J; is taken as the witness of o,), then it also satisfies o (where fis taken as the witness of o). 
In case o is not an EFD, then one just omits the requirement of the existence of f 

As we are going to see shortly, with this approach EF D's behave very much like FD’s (in 
the sense of Propositions 1 and 2). It would be interesting to see what happens if one imposes 


natural restrictions on the witness function f such as invertibility, 0-1-valuedness, etc. 
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In the following, if Z is a set of dependencies, we denote by Zp the set of 


FD’s {X—Y: X-, Y is in Z}. 


e 


Proposition 1: 


Let = be a set of EFD’s; then ZEX—, Y iff Zp XY. 


e 


Proof: Consider the following chase procedure for computing X +. initialize Xt to X; 
repeatedly locate a member Z—>B of 2 such that ZC XY + and B is not contained in ¥*+, and 
set Xt to XTUB. As is well known [MMS], this procedure terminates with a unique X + and 
furthermore Z=X—7Y iff YCX*. 

We will now argue that also ZX, Y iff YCXT. First, if YCX*, then it is clear that 
2EX->, Y by the construction of X + (observe that, if ¥— é Y and Y> ~ Z, then X>, Z). 
Conversely, if Y is not contained in X + we will show that £ does not imply X te Y. For each 
EFD Z-, B in &, pick as its witness a function f7_, p such that f7_, p(t7)=17p, where 17 is 
a tuple over Z with t7{[W\=a for all W, and t7p is a tuple over ZB with 17 p[W]=a for all W. 
Now if g is a purported witness of X¥-+, Y, then consider the database R consisting of a single 
tuple ¢ with {W)=a for WEXT, and 4{W\=y otherwise, where y#a 4(g(4X)), for some A in 
Y-X. It is clear that R satisfies each EFD Z->, B in & (with witness f7_, p), but R does not 


satisfy X-—>, Y with witness g. | 


e 


Proposition 2: 

Let Z be a set of EFD’s, and let &' be a set of FD’s and JD's. Suppose that LUZ'’Feo: 
(a) If o is an FD or JD or embedded JD, then 2/UL'o. 
(b) If o is an EFD, then ZLEo. 


Proof: 

(a) If z puz does not imply o, then there is a relation R which satisfies = puz but violates 
a. Now since RF=2yj, clearly we can pick a function f; for each EFD o; in = such that R also 
satisfies o; with witness f;, Thus, R satisfies ZUZ’, and therefore ZUZX' does not imply o. 


(b) Assume that 2 does not imply o, and observe that the one-tuple relation R constructed in 
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the Proof of Proposition 1 also satisfies =’ (since it satisfies any FD, JD or embedded JD. Thus, 


R satisfies XU’ and violates o, and so ZUZ’ does not imply o. | 


Thus, we can easily augment any of the known axiom systems for F'D’s, FD’s and MVD’s 
etc. to include EFD’s. Moreover, our characterization of complementary views (Theorem 1) can 


be extended to include EFD’s as follows: 


Theorem 10: 
Let = be a set of FD’s, JD’s and EFD’s. Then X, Y are complementary iff: 
(a) They are complementary when considered as views of my,y)(R) (ic., Z implies the 
embedded MVD XN Y->X-YIV-X): and 
(b) ZEXUY-, U. 


Proof: The "if" direction is immediate: from (a) 7 \(R}*2 (R)=a xU y(R) for every legal 
database R, and then from (b) R=f('7 yj WHRV=Ka (Ra y(R)), where f is an instance- 
independent function. Thus, if for two legal instances R, R’ we have 7)(R)=a RD) and 
awy(R=m (R'), we get R=fry(RPafR=Aa (Rap )=R, ie. X, Y are 
complementary. . 

For the “only if" direction, assume first that (a) is false, ie. & does not imply the 
embedded MVD XN Y->—>X-Y|Y-X . We first remark that the Equivalence Theorem of [SDSF] 
is also true if o is an embedded MVD (using the partial extension of the equivalence between 
dependencies and formulas to include embedded MVD’s described in Section 7; the 2-tuple 
Subrelation Lemma can be extended to the case in which o is an embedded MVD, by an 
argument analogous to the one given for the case in which o is an M VD). Using the same 
construction as in the Proof of Theorem 1 (combined with Proposition 2 (a) and the above 
observation), we obtain two distinct two-tuple relations R, R’ such that w Raa xR), 
a (R)=n (R’), and R, R’ satisfy all the FD’s and JD’s in > and all the FD’s in Zp Then it is 
easy to see that we can pick, for each EFD o in , a function f such that both R and R’ satisfy 


o with witness f This shows that XY, Y are not complementary. 
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If (b) is false, then (YUY)*#U, where (XUY)T is the closure of XUY wrt Zp Let R, R 
be two one-tuple relations such that R[W|=R'[Wl=a for W in (XU Y)*, and R[W#R'[W] 
otherwise. Clearly, R#R’, 1 y(R)=7 y(R'), 7 (RJ=2 y(R’), R, R’ satisfy all FD’s and JD's in &, 
and moreover by picking as the witness of an EFD Z->B in & a function f7_, p as in the Proof 
of Proposition 1, we see that R, R’ also satisfy the EFD’s in 2. This shows that X, Y are not 


complementary, and the proof is complete. | 


Intuitively, Theorem | stated that, if the only dependencies present are F'D’s and JD’s, then 
the only way to reconstruct a database from two projections is by join. Theorem 6 states that, if 
EFD’s are also present, then the only way is to join the two projections and then explicitly 


compute the information which is still missing. 
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- 6. Conclusions and Directions for Further Research 


In this work, we have studied some of the computational problems arising when one 
considers applying, in the context of the relational model, the methodology suggested by 
Bancilhon and Spyratos for translating view updates. We discovered that certain important 
problems such as testing translatability and determining a complement which renders an update 
translatable, although solvable in polynomial time (Theorems 3, 6, 8, 9), exhibit an interesting 
kind of inherent complexity (Theorems 4, 5, 7), which indicates the existence of limitations on 
how efficiently they can be solved. However, we have only concentrated on a very simple case of 
the application; we feel that much remains to be done before a reasonable account of the 
applicability of the methodology can be attempted. In particular, the following possibilities seem 


to us to be worthy of further investigation: 


(1) Allowing more general dependencies: In particular, it would be interesting to see to what 
extent can Theorem 1 be generalized, especially in view of the negative result of [V1]. More 
importantly, though, one should study the problem of testing translatability and designing a 
translation (recall that we found the translation of deletions to be trivial just because we only 
considered functional dependencies). It is conceivable that our basic idea of a chase-type 


algorithm will be useful, although we cannot see to what extent. 


(2) Considering views that are a restriction of a projection (i.e. of the form o pm y, where P is a 
predicate on tuples): It should be noted that most of the views occuring in practice are actually 
of the above form. The complement here can be a pair of views, ¢.g. (op, Opwy) oF 
(opty, wy), where my is a complement of wy. We believe that, in the case of only 
functional dependencies (which is still very important from a practical viewpoint), our basic 


approach can be used with only simple modifications (at least for certain P's). 


(3) Considering multi-relation databases with views that are projections of joins of relations: 


this is most important, given that the universal relation assumption is being criticized as 


unrealistic. We also belicve that this is likely to be the theoretically most interesting direction. 


(4) Studying the explicit functional dependencies: It seems to us that /FD’s are a step in the 
right direction, if one wants a model capable of capturing the information content of database 
mappings. We have already examined their influence on complementarity of views (Theorem 
10). Their effect on issues like testing translatability or designing a translation (perhaps in 
conjunction with refining our definition to capture more semantics) is a question which we feel 


deserves further research. 
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